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ABSTRACT 

With the rapid decrease in cost of genome 
sequencing, the classification of gene function is 
becoming a primary problem. Such classification 
has been performed by human curators who read 
biological literature to extract evidence. BeeSpace 
Navigator is a prototype software for exploratory 
analysis of gene function using biological literature. 
The software supports an automatic analogue of the 
curator process to extract functions, with a simple 
interface intended for all biologists. Since extrac- 
tion is done on selected collections that are se- 
mantically indexed into conceptual spaces, the 
curation can be task specific. Biological literature 
containing references to gene lists from expression 
experiments can be analyzed to extract concepts 
that are computational equivalents of a classifica- 
tion such as Gene Ontology, yielding discriminat- 
ing concepts that differentiate gene mentions from 
other mentions. The functions of individual genes 
can be summarized from sentences in biological 
literature, to produce results resembling a model 
organism database entry that is automatically 
computed. Statistical frequency analysis based on 
literature phrase extraction generates offline se- 
mantic indexes to support these gene function 
services. The website with BeeSpace Navigator is 
free and open to all; there is no login requirement at 
www.beespace.illinois.edu for version 4. Materials 
from the 2010 BeeSpace Software Training 
Workshop are available at www.beespace.illinois. 
edu/bstwmaterials.php. 



INTRODUCTION 

Automatic classification of gene function enables com- 
puters to complement curators for new genomes, by de- 
veloping new pipelines for annotating genomes. As 
sequencing becomes cheaper, task-oriented genome 
analysis will become common, studying dynamic expres- 
sion of differential genes for particular tasks for particular 
organisms. Fixed microarrays are already being replaced 
for expression experiments by high-throughput sequenc- 
ing using RNA-seq methodologies. The source of func- 
tional information still primarily resides in the biological 
literature, but automatic extraction is necessary due to the 
range of experimental tasks now feasible on organisms 
with unknown genetics. The original proposal for the 
BeeSpace project stated testing on honey bees the goal 
of 'biotechnology enables routine expression analysis 
and bioinformatics enables functional analysis uncon- 
strained by pre-existing categories'. 

The BeeSpace Navigator is a prototype software for 
interactive analysis of gene function, via semantic indexing 
of biological literature. The software enables users to 
create spaces, which are literature collections targeted 
toward the gene functions of a particular task. Available 
services analyze gene lists from expression experiments 
and summarize gene functions using descriptions ex- 
tracted within specified spaces. These services support 
automatic analogues of curator processes of reading lit- 
erature for creating databases, with accuracy high enough 
for practical utility. Using the analysis environment 
enables biologists to interactively transform unknown 
genes into known functions. 

The semantic indexing enables interactive sessions to 
discover gene functions, using statistical operations on 
user-specified collections task specific for the user's experi- 
ments. Indexing details are discussed in separate papers 



*To whom correspondence should be addressed. Tel: +1 217 244 0651; Fax: +1 217 244 1781; Email: schatz(rfillinois.edu 
© The Author(s) 2011. Published by Oxford University Press. 

This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/ 
by-nc/3.0), which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited. 



Nucleic Acids Research, 2011 , Vol. 39, Web Server issue W463 



for the individual components as described below; this 
article focuses on their integration into an interactive 
system for functional analysis on community spaces. See 
Figure 1 for schematic of the services interacting with the 
spaces for BeeSpace Navigator. All services uniformly 
revolve around spaces, which represent a specific point 
of view for the specific task at hand. The paradigm of 
space manipulation is explicitly embedded into the user 
model of creating spaces, with Filter and Cluster, then 
navigating them to support the functional analysis, with 
Summarize and Analyze. 



COMMUNITY SPACES 

The BeeSpace Navigator is an analysis system for creating 
and traversing spaces. It represents the first fully fledged 
implementation of the Interspace, the next-generation in- 
formation infrastructure for the Internet, where the fun- 
damental paradigm is concept navigation of community 
spaces rather than document search of scientific literature 
(1). The technologies of the Interspace promise a new level 
of functionality in building analysis environments for 
functional genomics (2). 

In the BeeSpace system, a 'space'' is a collection that is 
semantically indexed. Every service operates on spaces. 
Every collection of documents generated in any way can 
be saved as a space. The spaces are automatically named 
for user convenience, but can be renamed for special 
purposes. For example, a search query generates a space 
for the user with the search string as name. The spaces are 
used as the input to the services, for example, the Genelist 
Analyzer and the Gene Summarizer operates on the 
selected space. 

The easy selection of the underlying space is a major 
difference between manual and automatic curation 
processes, between 'base' systems such as FlyBase and 
'space' systems such as BeeSpace. This has to do with 
the underlying collection from which the function facts 
are extracted. With manual curation, there is a single 
static collection. In the FlyBase case, this collection is a 
selected subset of the MEDLINE articles that mention 
Drosophila melanogaster . 




Figure 1. Interaction schematic of system usage, services interacting 
with spaces. A 'space' is set of scientific documents for a particular 
task, (a) Search (or filter) any system or user space and make new 
smaller spaces, (b) Use terms of interest as seeds to cluster together 
related terms, in a space; save the documents associated with a given 
cluster of terms as a smaller space, (c) Summarize functional informa- 
tion about a gene, in the context of a space, (d) Analyze gene lists in 
the context of a space to reveal significant 'concepts' and genes present; 
and select concepts and/or genes to make new smaller spaces. 



With automatic curation, there can be multiple dynamic 
collections. That is, different collections can be used for 
extraction for different tasks. For example, a gene can be 
described with the Gene Summarizer using a genetic 
model {Drosophila melanogaster), the organism being 
studied (Apis mellifera) or the behavior being studied 
(early foraging). The gene will match different sentences 
and different functions may result. Thus, task-dependent 
curation can be achieved, using the automatic extraction 
from different collections. 

The current master database is all MEDLINE abstracts, 
but other biological or scientific literature would work 
equally well. For example, previous versions of BeeSpace 
Navigator had substantial collections from Biological 
Abstracts and from beekeeping books. The statistical 
technologies for semantic indexing have been utilized 
across a variety of subjects over many years with good 
results (3,4); they rely upon consistent terminology for 
contextual redundancy. 

Figure 2 illustrates the nascent BioSpace within the 
system, with spaces across all of biology. There are 
spaces derived from the Universe, which is currently all 
of MEDLINE. Some of these spaces focus on organisms, 
such as Apis mellifera or Tribolium castaneum, whose gene 
expression is being analyzed. Some of these spaces focus 
on behaviors, such as foraging, which represent specific 
tasks being studied. Note the foraging space includes 
this behavior within both insects and mammals. Finally, 
some of these spaces are broad categories, such as insect 
or mouse, primarily useful as standard platforms to 
leverage new spaces. These spaces are moderated by the 
administrator; user spaces can be migrated into system 
spaces as appropriate. 

All system services utilize spaces, which are collections 
of documents generated by internal operations. This 
concept is a more general research version of the currently 
popular commercial concept of smart folders automatic- 
ally generated with simple searches. The basic operations 
enable users to create spaces in various ways, and then use 
these spaces to analyze gene functions. A common user 
session is thus to generate an appropriate space describing 
the task at hand and then operate on this space with ap- 
propriate services. For example, search for foraging within 
insect collection then cluster to find the set of documents 
relating to foraging within social insects. After saving this 
set as a space, use that space to analyze the functions of 
genes from an expression experiment concerning honey 
bee foraging. 



SPACE CREATION 

Building new spaces from old is an important part of the 
new paradigm. As mentioned, users can subsearch within 
their previous spaces to create smaller spaces or utilize 
space arithmetic to merge together several spaces into a 
combined one. But users can also leverage the spaces 
found by other users. This feature represents a powerful 
form of information sharing. No login is required, since 
spaces are default to the guest account, but users can login 
to create spaces that are categorized within their own 
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Figure 2. System spaces provide standard starting collections for new task spaces within the BioSpace. 



accounts, enabling them to save and manage their spaces. 
The privacy settings permit spaces to be hidden, but en- 
courage openness for scientific progress. Indeed, building 
new research upon old research is the paradigm of science; 
this is directly represented within the space paradigm 
using community spaces as the knowledge unit. 

The system services for space creation are general. 
Searching is a mechanism for partitioning spaces via the 
Filter service, there are also powerful features for cluster- 
ing via the Cluster service. These have been optimized via 
parallel computation on shared memory servers, to 
operate interactively with appropriate-sized collections. 
The Natural Clusterer works from the bottom-up, parti- 
tioning on the basis of phrases occurring in the docu- 
ments. Using small worlds algorithms, it operates 
interactively on collections in the range of 50-100 K docu- 
ments (5). The Steerable Clusterer works from the 
top-down, partitioning on the basis of phrases weighted 
by words specified by the user for each cluster. Using 
language models algorithms, it operates interactively on 
collections in the 5-10 K document range (6). Individual 
clusters can be saved as spaces by the user to feed into 
other services for functional analysis. 

The real-time interactive clustering is possible only 
because of modern computing technology. Clustering 
involves computing distance in an /7-dimensional space, 
so speed is proportional to size of the memory. The pro- 
duction system for BeeSpace Navigator is hosted on a 
$70 K parallel computer with 128GB memory at http:// 
workerbee.igb.uiuc.edu. This is double the memory of 
the S7M super computer that hosted Interspace 
Prototype for the first semantic indexing of MEDLINE 



a decade ago (7). The project server has fewer processors 
than the supercomputer did, 16 versus 128, but the 
software is largely memory bound on the shared-memory 
machine. The implementation gains its interactive speed 
by locking the pre-computed indexes into memory. 



GENE LIST ANALYSIS 

Genomics has progressed from single genes to multiple 
genes, as the wet lab technologies have improved with 
microarrays and next-generation sequencing. The dry lab 
technologies have to correspondingly progress from single 
genes to multiple genes, to support functional analysis. 
The Genelist Analyzer is the automatic computational 
analogue of a genetic classification database, just as the 
Gene Summarizer discussed next is the automatic compu- 
tational analogue of a model organism database. 

The Gene Ontology (GO) is a hierarchical classification 
for biological topics used by human curators to specify the 
functions of genes (8). The major model organism data- 
bases are committed to GO and use the GO categories 
to specify additional functions for the curated genes. 
Functional analysis for microarray experiments is 
commonly supported using GO, by translating differen- 
tially expressed genes into the closest model organism 
using ortholog computations and then looking up the 
major classifications of the corresponding genes in the 
classification. 

The Gene Ontology has the same limitations as other 
biomedical classifications, being dependent on the litera- 
ture curated by the model organism database projects. For 
example, the BeeSpace project performed large-scale 
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microarray experiments on social behavior in honey bee. 
There was thus difficulty in obtaining good results from 
GO, since the closest model organism was fruit fly, an 
insect but one with little social behavior. Recent analyses 
of GO nodes have also shown that annotation quality 
varies across model organisms (9) and that functional co- 
herence of hierarchical terms is not uniform (10). 

In contrast, the BeeSpace Navigator uses a computa- 
tional approach to gene classification based upon current 
terminology contained in biological literature. Just as 
full-text search can extract the most specific and most 
current documents, automatic curation can extract the 
most specific and most current categories. Since 
discriminating concepts are uniformly extracted from 
whatever terms are represented within the specified litera- 
ture, there is no bias from existing classifications. The 
Genelist Analyzer thus trades accuracy for currency: it 
uses the free text of specified documents to extract 
concepts that frequently appear with genes. 

The Analyzer transforms gene names into concept 
names (11). The concepts identified are those with differ- 
ential expression against the background, just as the stat- 
istical model for differential expression in microarray 
experiments. But concepts are identified, rather than 
genes, and the background is the specified space that de- 
scribes the current task being analyzed. The ability in 
BeeSpace for space-oriented computations enables some 
of the accuracy loss in automatic extraction to be over- 
come, since the concepts are extracted only from the docu- 
ments within that space. Just as different backgrounds 
produce different expressions with microarray experi- 
ments, different spaces produce different lists of concepts 
given the same lists of genes. 

BeeSpace Navigator is an interactive system with select- 
able spaces, where the background for differential expres- 
sion can be rapidly changed, providing support for dry lab 
experiments. Since every document collection created 
within the system can be transformed into a space, sets 
of concepts can be sweep selected and the documents 
associated with these concepts saved into a space. The 
saved space is then fully usable within the system and 
can be filtered or clustered to produce relevant subsets. 
For example, the most frequent concepts or the most 
frequent genes that are differentially expressed from a 
microarray experiment can be saved into a space, and 
then this space clustered using Steerable Cluster to find 
the documents most relevant to the behavior task. This 
dynamically created space can then be used as background 
for gene list analysis. 



GENE SUMMARIZATION 

Once a gene list is analyzed within a space, individual 
genes can be described using functional information con- 
tained within the articles of that space. BeeSpace began 
with automatic reproduction of FlyBase summarization 
(12). In FlyBase, gene descriptions are generated from 
database entries specified manually by human curators, 
who read literature articles to extract facts. The automatic 
curation process in BeeSpace Navigator was developed 



into a production service called the Gene Summarizer 
(13). This takes a gene name as input and produces 
relevant sentences describing the gene's functions as 
output. 

During early system usage, it became clear that within 
an interactive system, showing the sentences from which 
the facts were extracted was more useful than simple 
summaries, since it enabled the biologist users to fully in- 
terpret the document context. The context for utility for 
biologists is necessarily more detailed than that for 
curators since they are generating hypotheses rather than 
extracting facts; the language processing is necessarily 
deeper than interactive support for FlyBase curators (14). 

The categorization for the summarizer relied upon a 
semi-structured machine learning algorithm, derived in 
part from those for newspaper summarization. 
Development required new algorithms for deriving and 
training to classify sentences into functional categories 
(15). These categories were initially taken from FlyBase 
summaries, later refined by the curators themselves 
through our collaboration with the FlyBase project. The 
categories of gene function are: Protein Domain/ 
Structure, Homologs/Orthologs, Expression Patterns, 
Regulatory Elements, Phenotype/Function, Genetic/ 
Physical Interactions and Population Genetics. 

The original plan was to have FlyBase curators provide 
training sets through our collaboration. But in comparing 
the recognitions of the professional curators to those of 
the practicing biologists who were working closely with us 
on developing the analysis system, we discovered that they 
differed substantively. Curators wanted only unambigu- 
ous facts, whereas biologists wanted useful observations. 
For example, a sentence speculating about a function was 
eliminated by the curators as uncertain but marked by the 
biologists as interesting. 

We chose those biologists closest to our projected user 
population, graduate students and postdocs in entomol- 
ogy and neuroscience, as trainers to judge the categories of 
sentences. Semantic indexing using the trained recognizers 
was performed during 8 hour batch jobs on the National 
Science Foundation Cloud Computing Testbed hosted at 
the University of Illinois at Urbana-Champaign, making 
BeeSpace its largest internal user. Later, we were able to 
develop separate software for question answering (4), not 
included in version 4, which extracted entities and rela- 
tions to generate biological facts, so closer in semantics 
to human biologist curators. 



EXPLORATORY ANALYSIS OF GENE FUNCTION 

The software has gone through four versions over 6 years 
and has been used by over 100 biologists, including local 
researchers, workshop trainees and meeting attendees at 
Arthropod Genomics Symposia. The production Version 
4 was early released in January 2010 and fully released in 
June 2010. 

Training materials are available at www.beespace.uiuc. 
edu/bstwmaterials.php, including overview talks and 
sample sessions from the Training Workshop held in 
April 2010. The software itself is a mix of C, Python 
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and Java. The user interface runs within a web browser 
using ExtJS javascript, an AJAX toolkit. A recent version 
of Mozilla Firefox or Google Chrome is recommended for 
best performance. No login is required for usage as guest, 
but if the user logs in, then generated spaces are saved 
under their login for future utilization. 

This interface enables all the services to be used in 
concert with each other, demonstrating the utility of a 
space-based system. Every service can create a new space 
during the interactive session, whenever a relevant collec- 
tion of documents is created. Thus the whole of the system 
is greater than the sum of the parts. In particular, com- 
munity sharing of discovered knowledge about gene func- 
tion is an integral part of the system, since biologists can 
reuse existing spaces. 

As an example session, following are representative 
screendumps of the analysis performed by the first 
author, a BeeSpace biologist postdoc, for her gene expres- 
sions on behavioral maturation in honey bee (16). Only a 
few key screens are shown; for a full step-by-step usage, 
see the sample sessions in the Supplementary Data, with 
each step annotated. 

The gene list extracted from these experiments and used 
below in the session can be found at www.canis.uiuc.edu/ 
~schatz/GAtest.txt. The gene list must consist of record 
numbers from model organism databases, here FlyBase 
(FB), but also including Mouse Genome Informatics 
(MGI) and Saccharomyces Genome Database (SGD). 

In the session, the biologist did multiple Filter oper- 
ations with space arithmetic to build a task space on 
behavior maturation containing 46 340 of the 69 154 docu- 
ments for insect. She created separate spaces for social 



insects (sociality within insect space) and for behavioral 
maturation (maturation within animal space), then inter- 
sected these custom spaces into a task space. 

Now that the background is selected, the exploratory 
analysis of gene function can commence. Figure 3 shows 
the Analyze service where the gene list from the expression 
experiment has been entered into the box and the concepts 
within the background space analyzed. The most discri- 
minating concepts are rank ordered in the left lower pane. 
These are the terms that differentially occur most often in 
documents with genes from the list mentioned as opposed 
to terms in other documents. The genes themselves are 
rank ordered in the right pane, with names translated 
from the FlyBase ortholog numbers specified as input, 
as described elsewhere (11). 

To examine the results, the top genes are selected, 
taking the 10 most frequent after skipping the first two 
as artifacts. A save is then issued, to create a space of the 
documents containing these genes within the background 
space. The new space resulting from the top genes 
differentiating the gene expressions within the maturation 
task can now be used for further exploration. Figure 4 
shows the Cluster service operating on the 2314 docu- 
ments. At this scale, the biologist can partition semantic- 
ally along desired lines using the Steerable Clusterer. Here 
the user partitions along the lines of the top concepts dis- 
covered during analysis, namely 'shock', 'diapause', 
'repair', and adding the important 'hormone'. Note 
diapause was discovered automatically, and is relevant 
to behavioral maturation in response to environmental 
conditions. Within the 'hormone' cluster, underlined 
terms such as JH for Juvenile Hormone were 
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Figure 3. Analyze transforms a gene list into a list of discriminating concepts within a space. 
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automatically recognized as genes during the semantic 
indexing (17). 

If the biologist wishes to discover the function of genes 
discovered during the exploratory analysis, she can sum- 
marize what knowledge is known within the literature. 
Figure 5 shows the Summarize service, where it has been 
implicitly invoked by clicking on the gene name under- 
lined as juvenile hormone. The Gene Summarizer then 
simulates the curation process for the specified gene on 
the specified collection. In this case, the specified collection 
was the task space that resulted from exploratory analysis 
of the experiment for gene expression. The computed 
categories of gene function include gene phenotypes and 
gene interactions. The automatic curation enables a 
task-centered analysis of gene function rather than an 
organism-centered, which fits better with experiments 
such as this on dynamic expression for animal behavior. 

The process of concept navigation can be continued by 
opening a document such as highlighted one on Manduca 
sexta and clicking on an underlined gene name, such as 
EcR for Ecdysteroid Receptor. Now that the biologist has 
explored from bees to moths, she could re-analyze the 
expression gene list within the new context of ecdysteroid 
receptors by simply creating a space from the summarizer 
results and clicking the Analyze tab to reinvoke the 
Analyze service with the same gene list but on the new 
space. Exploratory analysis of gene function is an iterative 
process of concept navigation through biological 
literature. 

As another example, another BeeSpace postdoc was 
investigating gene expression of maternal behavior in 



paper wasp (18). Maternal behavior is little studied in 
insects so running Analyze against Drosophila space was 
little help. After trying analysis with fly orthologs from the 
wasp genes, she instead generated mouse orthologs from 
the same wasp genes. A task-oriented space was created 
from the mouse literature relevant to maternal behavior 
using Filter of 'maternal' query against mouse space. Then 
Cluster was done on the 9618 documents within Mouse 
Maternal space, using the Natural Clusterer this time since 
the functional category was not known. The 33 resulting 
documents from one relevant cluster included an article 
dealing with regulation by a heat shock factor gene of 
maternal expression in the oocyte (19). 

The postdoc insect biologist found this article in the 
mouse literature of significant interest. Her GO analysis 
did not identify heat shock as discriminating term for her 
gene list, possibly due to the wide distribution of heat 
shock entries across their conceptual hierarchy. The 
more focused automatic analysis discovered this con- 
cept relation. Typically, navigations with appropriate 
orthologs against task-oriented spaces produce novel 
documents of interest to the biologist. Besides insects, 
the BeeSpace Navigator has helped biologists working 
on fish, pigs and plants. 

CONCLUSION 

As the examples above show, the effective process of 
exploring function via concept navigation leverages 
existing manual classifications and existing biologist de- 
scriptions. The model organism databases are essential 



W468 Nucleic Acids Research, 2011 , Vol. 39, Web Server issue 





File Edit View History Bookmarks Tools Help 

^99^ C X A (Z] htW/wofkerbee.igb.uiuc.edu:8200/ 


t> -| |Ot- Yahoo P\ ^ 


Ll BeeSpace Navigator, version 4 | * | 


0 Logged In as schatz ^Logout _ Send feedback 


Space Panel ! ? 

Matthews kathy • 


| Rlter | Cluster Summarize | Analyze | 

Search: juvenile hormone * fi ? 


Munoz-Torres Monica 
Newman Katie 


Protein Into; 3 results - 


Ronge Matt 
Sanders Barry 


Horootog Info: O results " 
Expression Pattern: 0 results - 


Sanogo Osee 

Schatz Bruce L 




Regulatory Element Info: 1 results - 


Toth Amy 




Phpnotyne and Function Info: 59 results 




Woodard Sarah 








B Spaces ■ ■ -Hf 


a 
a 


10 The juvenile hoimone [JH) binding activity of the extracts was measured using a hydroxy apatite assay and the purification progress was monitored by natKie gel EftHMnHBraJh) nd -r 1i.im dodecyl sulfete- 

11 A prohormone (P1) of locust ad pah net . hormone I [AKH I) is shown here to be a homodimer of a 41 residue subunit called the A-chain. 




Remove X Rename . Operations 

U J TOm ■ Sam 


£ 
£ 


12 Jlv-t le I uiij 1 t L 1 1J1 ■ i_i di Air -. Ll-d J ti -.'/in ix:rsc:ed iorti \l 5 ■iseirul r pi jI lh-Jl i -Jtsj^r: 1 j-jlb:b Sl mi A nit 1 oitCji a =nd ',1s Jr.eian-fi itlc Lii.ksi; G-i, Ilb b rnscjlsl.t 

13 Prothoracicotropic hormone (FTTH) of the silkworm Bombyx man. was punfi ed and its pnmary structure detemrined for the most part 


682 Flybase Reference 47570 t 


£ 

±1 


14 C'Uatacear ri 1- 11 nLnti ic 1 ur ■ 1 ji 1 (1.11 1 = P'u y -": P. '1- -'i jdLCJC Lv e:u y lie =■ 1= II- cji 4 1= ^Lupiti •_. 11 e b i:Ns = A 1 .■_ ,■ : ItIi. ic i ilII il 1 u 11 uibb by pjre-: ' - ji _|= 1 1 - 

15 Sirrn t c.irp-i I : ;i ts If -arli a-:rd = mrp n^l -hrtnph ir-= s aral ■■=!= r- rirtPirs. tnr itprtad --lh r-v^a p1 apvprsl mrso-ll ihr p-y.=. t- =i-rl Ivir na n- -:-p. -rptpc pir-p rs wiim -par:p-.1 ,v:L sit hncip^ :r p.. it 


683 Insect 69154 


±1 


16 F-li' ri irrmrn.-' FH: ^ n n"n1rT:r r 7T prl 1r l i-.l irl-- ?r In- it-p-I -i-T-.n r '-y-trrr ti -n rn-r fir -tr-r r,'pn1 'rhrn- r .r ■ 1 n nr irirrj I|-p i1 -Mr i/t r -i nr ! n- ,v-H n- :n n-t ;-:tr ri'.v 'oir-* 




684 Apis Mellifera : forage U Apis mellifera 2215 
686 Universe . Apis mellifera . forage 30 


a 

£ 


17 The equiliur umdfsac nation cm ats il .. foi juvenile hormone 1 and II were approximately 6 x 1 □(- 1 □) M at 4 degrees C. while racemic juvenile hormone III disp ayed an equilibrium dissc .ul on constant of 1.9 > 

18 lr u-c-61 II Bli.d-; .he dUbi IilhI Leu fit tttiuc r 1. Iric 1 uviLiie 2)--\\ Ji j:-.v=l lI ■= ji ri -e V- =:mio 'i .IL-iilI jlj-j lIe* r l. =r_ It =.1 jp.-i'.' 1 jll-ltn jl'i jL NA. : . el jl J 114 f.n ji J- 1 i?l epf. 1 L-l— -Jk: aid = eai 




687 Unrverse : Apis mellifera - forage U Universe : Aprs 2186 
701 Universe - Insect - maturation 500 


* 

£ 


19 ^ In the model insect Uanduca sexta the production of ecdysteroids by the prothoracic gland is acutely controlled by a brain neurohormone, prothoracicotropic hormone (FTTH). 

20 Allatectomized animals treated with juvenile hormone III (JH) or methoprene caused the appearance of vitellogenin in the hemolymph, but had no effect on hexamehn levels. 


702 Universe : Insect ; maturation [Query expand) 27779 


£ 


21 V.'p rihtanpd a il I- =rrjM -I [-.A pir innj ri =|-.a i=p Inrrtrir 3 ( IH; air riprr n-mp L r s; ndi J = = a "li.a:ir ] iPirnripp-ir= FbA'Jj ir San s -yilna ir n La-;p1 ir ho:h ipvpt : p "lan-r p:a ip-FI (H 1 -PC-) 1 an.- 




703 Universe Insect maturation Top Concepts 1222 


£ 


22 Cinlrtnpir r-lr i-.iir ir rpir !■ "nRhj rf-n-inr ifr. p .-.[--- 1 ,- .-- - v mt ir^rtiir-l r it p-njr-". i- bc-ir 3 nr- ir t ir i-;r n:ir n ,nr. r i-| - -t.-nrati'ir .-- fnrtr r- 1 h ri -;t t 1 1 it ■-■ r A'mI^ pnrir.t n- ni, tto 0 'a" 1 , 




704 Universe . Insect . maturation: Top Genes 5904 


£ 


23 P'lT.h jr jlic A lli-j ■ii.ii ji 0 Iil l lk*jr:i D r_ r 1 b i-. r c ■ 0 i jmii 11 J ..u^lo-ilc j-c c me 'it v.UkjLjrc 




765 Unrverse : Insect : maturation Top Concepts 1894 


£ 
£ 


24 Within the 2.2 Kb regicn between hsp23 and gene 1 of the small heat shock gene locus 67B1 of Drosophila melanogaster. an approximately 1 kb perturbation of the chromatin architecture has previously be 




767 Insect : behavioral maturation 1393 


25 I ip rat.iral y i-r inrg -nartnmp- it in vi\ p lni~i t= Hi" ./as r ir,p-t J r fi n^-ih vrnn- II a-id ard :L?- r n ,a Pit y ada- 1 H:i arnrnipj: ■ 1 - "S = f i -.r ?= 4h 


753 Ircoct "&ch3\"0"j mciint oi" j jilt : l -'ch e '01(1 
769 Insect "behavioral maturation" adult juvenile . "bet 8762 

772 Insect : maturation adult 8023 

773 Insect : maturation adult leishmania sc hsp19.5 ( 360 

774 Insect maturation adult ; fru. ACXD. TsfL 18w. SI 646 

776 Unrverse Insect maturation [Query expand): chic 2175 

777 Flybase : Reference : maturation (Query expand]: I 1S91 
773 Insects . merger, behavioral maturation). HspB3. Rh 3647 
fS2 Irs8d nprce Le i; .1 : 13 ri=: L -s:ii:n 1 1: p Genes 2314 


£ 


26 Moreover, the prewandering peak of the plasma enzyme juvpnile homionr. eeteraB b JHE] » as blocked by CsPQV injection 
Interaction Info: SS results 







Figure 5. Summarize describes the gene functions of specified gene within a task-specific. 



to turn expressed genes into gene names, which can then 
be correlated within the biological literature to extract 
gene functions. BeeSpace is the first major system to 
build spaces upon bases, to utilize scalable semantic 
indexing to automatically extract functional concepts 
and categories. Space creation is a uniform operation, 
enabling community sharing to build new collections of 
related documents from old. 

Such systems will be used to support hypothesis gener- 
ation to rapidly compare gene function using different 
spaces to discover new classifications. Currently, most 
gene lists are generated via microarrays specialized to par- 
ticular organisms. But the rapid increase in sequencing 
power is changing this paradigm to one where gene lists 
for specific tasks can be generated with general RNA-seq 
(20). Even an early system such as BeeSpace is just as 
effective for RNA-seq as for microarray, since it deals 
only with gene names and gene lists. Exploratory 
analysis by knowledgeable biologists leveraging existing 
databases is the viable path towards understanding gene 
function in the new era of dynamic expression of organism 
behavior. The interconnected space of biological know- 
ledge will be created by functional analysis from insects 
such as fly and bee, beetle and wasp, then move to 
InsectSpace and ArthropodSpace then AnimalSpace and 
PlantSpace en route to creating the BioSpace. 

The future system will change its name to BioSpace 
Navigator, reflecting this more general mission. As 
noted, the software has no embedded knowledge of bees 
or insects, merely general text mining. Future plans are to 
integrate the navigator functionality described in this 



production version into a fully fledged gene function dis- 
covery environment. Enhancements will include enhanced 
speed for services and periodic updating of sources, im- 
porting gene lists and exporting document sets, computing 
orthologs and linking to genome browsers. 

SUPPLEMENTARY DATA 

Supplementary Data are available at NAR Online, giving 
a step by step interactive session with BeeSpace Navigator. 
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